Hardware and Software Fault Tolerance: Definition and Evaluation of Adaptive Architectures in a Distributed Computing Environment

Authors

  • F. Di Giandomenico
  • A. Bondavalli
  • J. Xu
  • S. Chiaradonna
Abstract

This paper addresses tolerance to both hardware and software faults by defining several hybrid-fault-tolerant architectures that can coexist and operate simultaneously on top of the supporting environment, and it introduces a systematic method for evaluating their dependability, efficiency and response time. Because the target is general-purpose distributed systems, where multiple unrelated applications may compete for system resources, the architectural solutions are designed to adapt their use of redundancy to system conditions.

Similar resources

Hardware and Software Fault Tolerance: Adaptive Architectures in Distributed Computing Environments

This paper discusses the issue of providing tolerance to hardware and software faults in distributed computing environments as well as issues related to efficiency and flexibility. A set of new fault-tolerant architectures is presented, and a detailed dependability analysis of these architectures is performed together with an efficiency and response time evaluation. The proposed architectural s...

An adaptive approach to achieving hardware and software fault tolerance in a distributed computing environment

This paper focuses on the problem of providing tolerance to both hardware and software faults in independent applications running on a distributed computing environment. Several hybrid-fault-tolerant architectures are identified and proposed. Given the highly varying and dynamic characteristics of the operating environment, solutions are developed mainly exploiting the adaptation property. They...

Adaptive Architectures for Hybrid Fault Tolerance in Distributed Computing Systems

This paper discusses the issue of hardware and software fault tolerance in distributed computing environments as well as issues related to efficiency and flexibility. A set of new fault-tolerant architectures is presented, and a detailed dependability analysis of these architectures together with an efficiency evaluation is performed. The proposed architectural solutions are based on the assump...

Improving the palbimm scheduling algorithm for fault tolerance in cloud computing

Cloud computing is the latest technology that involves distributed computation over the Internet. It meets the needs of users through sharing resources and using virtual technology. The workflow user applications refer to a set of tasks to be processed within the cloud environment. Scheduling algorithms have a lot to do with the efficiency of cloud computing environments through selection of su...

Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)

Nowadays, faults and failures are increasing especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip due to the increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...

Publication date: 2000